Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents
نویسندگان
چکیده
Simultaneous translation is a method to reduce the latency of communication through machine translation (MT) by dividing the input into short segments before performing translation. However, short segments pose problems for syntaxbased translation methods, as it is difficult to generate accurate parse trees for sub-sentential segments. In this paper, we perform the first experiments applying syntax-based SMT to simultaneous translation, and propose two methods to prevent degradations in accuracy: a method to predict unseen syntactic constituents that help generate complete parse trees, and a method that waits for more input when the current utterance is not enough to generate a fluent translation. Experiments on English-Japanese translation show that the proposed methods allow for improvements in accuracy, particularly with regards to word order of the target sentences.
منابع مشابه
Decoding with Syntactic and Non-Syntactic Phrases in a Syntax-Based Machine Translation System
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the same time, it is desirable to include as much syntactic information in the system as possible in order to carry out linguistically motivated reordering, for example. We apply an exte...
متن کاملFactored Soft Source Syntactic Constraints for Hierarchical Machine Translation
This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach k...
متن کاملThird Workshop on Syntax and Structure in Statistical Translation
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the same time, it is desirable to include as much syntactic information in the system as possible in order to carry out linguistically motivated reordering, for example. We apply an exte...
متن کاملRule-based Syntactic Preprocessing for Syntax-based Machine Translation
Several preprocessing techniques using syntactic information and linguistically motivated rules have been proposed to improve the quality of phrase-based machine translation (PBMT) output. On the other hand, there has been little work on similar techniques in the context of other translation formalisms such as syntax-based SMT. In this paper, we examine whether the sort of rule-based syntactic ...
متن کاملExtraction of Syntactic Translation Models from Parallel Data using Syntax from Source and Target Languages
We propose a generic rule induction framework that is informed by syntax from both sides of a parsed parallel corpus, as sets of structural, boundary and labeling related constraints. Factoring syntax in this manner empowers our framework to work with independent annotations coming from multiple resources and not necessarily a single syntactic structure. We then explore the issue of lexical cov...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015